Deaths from undetected COVID-19 infections as a fraction of COVID-19 deaths can be used for early detection of an upcoming epidemic wave

With countries across the world facing repeated epidemic waves, it is critical to monitor, mitigate and prevent subsequent waves. Common indicators like active case numbers may not be sensitive enough in the presence of systemic inefficiencies like insufficient testing or contact tracing. Test positivity rates are sensitive to testing strategies and cannot estimate the extent of undetected cases. Reproductive numbers estimated from logarithms of new incidences are inaccurate in dynamic scenarios and not sensitive enough to capture changes in efficiencies. Systemic fatigue results in lower testing and inefficient tracing and quarantining, thereby precipitating the onset of the epidemic wave. We propose a novel indicator for detecting the slippage of test-trace efficiency based on the number of deaths/hospitalizations resulting from known and hitherto unknown infections. This can also be used to forecast an epidemic wave that is advanced or exacerbated by a drop in efficiency in situations where testing has come down drastically and contact tracing is virtually nil, as is prevalent currently. Using a modified SEIRD epidemic simulator we show that (i) the ratio of deaths/hospitalizations from undetected infections to total deaths converges to a measure of systemic test-trace inefficiency; (ii) this index forecasts the slippage in efficiency earlier than other known metrics; and (iii) mitigation triggered by this index helps reduce peak active caseload and eventual deaths. Deaths/hospitalizations accurately track the systemic inefficiencies and detect latent cases. Based on these results we make a strong case that administrations use this metric in the ensemble of indicators. Further, hospitals may need to be mandated to distinctly register deaths/hospitalizations due to previously undetected infections.
Thus the proposed metric is an ideal indicator of an epidemic wave that poses the least socio-economic cost while keeping the surveillance robust during periods of pandemic fatigue.


We have used an epidemiological simulator from [4] that defines most of these parameters, as listed in Table 1 (for example, inf_t, the mean infection time). These parameters are tunable, so the simulator can be used for any epidemic in which their values differ. We have configured it to fit the COVID-19 data obtained from https://covid19tracker.in/. Some biological parameters, such as the duration of immunity after recovery and the change in morbidity after reinfection, can be analyzed through sero-surveys, but such data are sparsely sampled and not easily available. Further, the transmissibility of infection differs across age groups and in asymptomatic individuals, and there may also be unknown pathways for the spread of infection [2]. We have not modeled these parameters because data on them are limited.
Addressing this comment, we have added these points in the Introduction section.
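To make the role of the tunable parameters concrete, the following is a minimal sketch of a discrete-time SEIRD-style simulation. It is not the simulator from [4]. The split of infections into detected and undetected compartments, the function name `simulate_seird`, the detection fraction `q`, the fatality fraction `ifr`, the seed size and the value of `beta` are all assumptions chosen for illustration; only the population size, `inf_t` and `lat_t` defaults echo the seed values quoted in the response to Q10.

```python
import numpy as np

def simulate_seird(days, population=10_000_000, beta=0.3, inf_t=9.5,
                   lat_t=7.0, q=0.5, ifr=0.01, seed_exposed=100):
    """Euler-step SEIRD sketch with detected/undetected infections.

    q is the (assumed) fraction of infections that get detected.
    Returns daily deaths from detected and from undetected infections.
    """
    s = float(population - seed_exposed)
    e = float(seed_exposed)
    i_det = i_und = r = 0.0
    deaths_det = np.zeros(days)
    deaths_und = np.zeros(days)
    for t in range(days):
        # Assumption: detected cases are quarantined and do not transmit.
        new_exposed = beta * s * i_und / population
        new_infectious = e / lat_t      # leave the latent compartment
        out_det = i_det / inf_t         # detected cases resolving today
        out_und = i_und / inf_t         # undetected cases resolving today
        s -= new_exposed
        e += new_exposed - new_infectious
        i_det += q * new_infectious - out_det
        i_und += (1 - q) * new_infectious - out_und
        deaths_det[t] = ifr * out_det
        deaths_und[t] = ifr * out_und
        r += (1 - ifr) * (out_det + out_und)
    return deaths_det, deaths_und
```

In this toy model, with a constant `q`, the share of deaths coming from undetected infections equals `1 - q` by construction, which illustrates the kind of relationship between the death ratio and test-trace inefficiency that the paper studies in the full model.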

Q2: It seems that there is a difference between the modeling estimates and real-world results in the study. I kindly recommend the authors compare the real data (COVID-19 data) with simulation results in their modeling study to investigate the accuracy of the simulation. For example, in their figures, the authors have estimated the pandemic with a single peak in 200 days; whereas in the real world, we have seen several peaks within such period. And each peak has different parameter quantities.
Thank you very much for your valuable feedback. We have incorporated your comments in our simulation of the real-world COVID-19 data from India and have presented the results in the "Performance of D_ratio and other metrics on real world data" subsection. This helped us validate our hypothesis on a larger real-world dataset. The metric gives the administrative bodies close to 25-30 days to prepare for an upcoming wave and to bring in interventions by keeping a tab on the underlying model parameters.
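A lead time such as the one quoted above can be measured as the gap between the days on which two indicators first cross their alert thresholds. The sketch below shows one way to compute it. The function names, the thresholds and any input series are hypothetical and are not taken from the paper.

```python
import numpy as np

def first_crossing(series, threshold):
    """Index of the first day the series reaches the threshold, or None."""
    above = np.flatnonzero(np.asarray(series, dtype=float) >= threshold)
    return int(above[0]) if above.size else None

def lead_time_days(early_series, early_threshold, late_series, late_threshold):
    """Days by which the first indicator crosses its threshold before the second.

    Returns None if either indicator never crosses its threshold.
    """
    early = first_crossing(early_series, early_threshold)
    late = first_crossing(late_series, late_threshold)
    if early is None or late is None:
        return None
    return late - early
```

A positive result means the first indicator raised the alert earlier. The choice of thresholds strongly affects the answer, so they would need to be justified for each indicator.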

Q3: Also, it is not clear how the authors have calculated the number of undetected cases (D) and what is their rational to do so in different types of pandemics.
Undetected cases in our model are obtained from simulations. As [5] describes, there are very few strategies for identifying the number of undetected cases in a population, the most popular being PCR and antibody tests. These tests sample the population randomly and estimate the undetected cases only in a specific population at a specific time point. Both techniques are very sparsely sampled, forcing us to fall back on simulations to estimate the undetected cases in the population. We have estimated the population of each compartment of the epidemic using the SEIRD model defined by the differential equations in [4]. In our experiment we fit the model to actual COVID-19 data from different Indian states, matching the simulated cases to confirmed cases, recoveries and deaths, which makes our estimate of undetected cases realistic for a fixed population.
We use the undetected cases to predict an epidemic wave much earlier than other known metrics like active cases or reproductive numbers. Hospitalizations and deaths are unmissable events in an epidemic and are better indicators of growth in an epidemic [6]. We use this information to derive the D_ratio metric that can foresee an epidemic wave.
We have addressed this comment in the introduction section.
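As an illustration of the metric discussed in this answer, the ratio of deaths from undetected infections to total deaths can be computed from two daily series. This is a minimal sketch: the function name `d_ratio`, the use of a rolling window and its 7-day default are assumptions for illustration, not the procedure used in the paper.

```python
import numpy as np

def d_ratio(deaths_undetected, deaths_total, window=7):
    """Rolling fraction of deaths that came from undetected infections.

    Both inputs are daily counts of equal length. Windows with no deaths
    at all yield NaN rather than a division error.
    """
    und = np.asarray(deaths_undetected, dtype=float)
    tot = np.asarray(deaths_total, dtype=float)
    if und.shape != tot.shape or und.ndim != 1:
        raise ValueError("inputs must be 1-D series of equal length")
    if und.size < window:
        raise ValueError("series is shorter than the window")
    kernel = np.ones(window)
    und_sum = np.convolve(und, kernel, mode="valid")  # rolling sums
    tot_sum = np.convolve(tot, kernel, mode="valid")
    ratio = np.full(tot_sum.shape, np.nan)
    np.divide(und_sum, tot_sum, out=ratio, where=tot_sum > 0)
    return ratio
```

Summing over a window before dividing keeps the ratio stable on days with very few deaths. The result has `len(series) - window + 1` entries, one per complete window.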

Q4: More importantly, we use any indicator for a specific purpose. However, in the current study, it is not clear for what specific purpose the proposed indicator should be used. Comparing the different indicators depends on the purpose.
Popular indicators used to analyze case growth are the reproductive number, active cases, daily new cases and test positivity rates. [7] analyses the case growth pattern in Indian states by clustering and ranking them by severity, using metrics like the number of active cases and death rates. But this technique needs enough data, so a considerable part of the epidemic will have elapsed by the time it yields insights. [8] discusses the correlation between the reproductive rate (Rt) and test positivity rates in different parts of the world. The authors show that a strong positive correlation between test positivity rate and Rt in India indicates that testing was insufficient to keep up with the increasing cases. Even though the test positivity rate correlates with the reproductive number and indicates an impending wave, it is highly biased by the testing strategies in different places (different states) and by the cooperation of the population. This induces variance in the measure, making it somewhat unreliable.
Further, contact tracing is also biased towards the serial interval in symptomatic people [2]. Hence there is a need for a new metric that overcomes these shortcomings and predicts a wave much earlier than those described above. We propose that D_ratio is one such metric: it is not sensitive to testing strategies and can be estimated from simulations, which gives us sufficient time to plan mitigation measures.
Addressing this comment, we have added these points in the introduction section.
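For comparison, the conventional indicators mentioned in this answer can be sketched as follows. The growth rate is taken as the slope of a straight line fitted to the logarithm of daily new cases, and the reproductive number uses the approximation R = 1 + rT, which holds only under the assumption of an exponentially distributed generation time. The function names and this particular approximation are illustrative choices, not the estimators used in [7] or [8].

```python
import numpy as np

def growth_rate(new_cases):
    """Slope of a least-squares line through log(daily new cases)."""
    cases = np.asarray(new_cases, dtype=float)
    if np.any(cases <= 0):
        raise ValueError("log-linear fit needs strictly positive counts")
    days = np.arange(cases.size)
    slope, _intercept = np.polyfit(days, np.log(cases), 1)
    return float(slope)

def reproduction_number(new_cases, mean_generation_time):
    """Approximate R as 1 + r*T (assumes exponential generation time)."""
    return 1.0 + growth_rate(new_cases) * mean_generation_time

def positivity_rate(positives, tests):
    """Fraction of tests that came back positive."""
    if tests <= 0:
        raise ValueError("number of tests must be positive")
    return positives / tests
```

Both quantities depend directly on reported cases and tests, which is why they inherit the sensitivity to testing strategy described above.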

Q5: I also suggest the authors strengthen their paper by using a few more references and discussing their results, comparing them with other studies in similar contexts.
We have added the below references in addition to the existing references.

Reviewer #2: [1] This article uses a modified SEIRD model to show that the ratio of deaths and hospitalizations from an undetected infection to total deaths converges to a measure of systemic test-trace inefficiency. I think there is enough new content in this paper to distinguish it from other works, considering that in recent years many articles have been published on the use of mathematical models for the dynamics of the COVID-19 pandemic, but without any practical utility. Q6: However, before proposing that policy makers and health administrators should consider including the Dratio metric as part of their decision-making framework, it is necessary that this paper is not merely a mathematical application of a model but also considers important biological assumptions.
Thank you very much for your constructive feedback. We have incorporated your suggestions in the Introduction and Discussion sections as well as in the answer below. The COVID-19 pandemic is also characterized by many biological parameters, such as transmissibility across age groups, the effect of vaccination, and differing epidemiological parameters across strains of the virus. But most of these biological parameters are not easy to study in an ongoing epidemic [6].
These parameters are obtained largely from sero-surveys and virological testing, which are sparsely sampled. Models that are fit to confirmed cases are sensitive to testing strategies; hence models fit to deaths and hospitalizations are more reliable [6]. Our model does not account for immunological changes, transmissibility changes due to reinfections [2], vaccination, or mutations of viral strains. Further, we have listed the model assumptions and limitations in detail in our response to Q8.

Q7: In a recent article, Jones and Strigul (2021) hypothesized that the unpredictability of the COVID-19 pandemic could be a fundamental property if the disease spread is a chaotic dynamical system. This means that the change in daily numbers of COVID-19 is affected by a very large number of factors, such as the population's adherence to prevention measures, vaccination, social isolation, and new variants of the virus. In addition, in a recent letter published in the Journal of Medical Virology, Divino and colleagues advert that models used to generate predictions, scenarios, and projections about COVID-19 infections and hospitalizations, are unreliable (https://doi.org/10.1002/jmv.27325).
Jones and Strigul (2021) [9] compared the growth in different countries with different underlying parameters, resulting in behavior like that of a chaotic dynamical system. In this work, however, we set out to provide administrative bodies with an aid for making specific reforms suited to curtailing the growth of the epidemic in a particular state, where there is less variance in the underlying parameters. The model parameters beta, c and q, and the infections from the influx of migrant population, are treated as piecewise constant. Also, since all these parameters are inferred from the same field data, we believe that unpredictability due to chaotic dynamics affects both the death ratios and active cases equally. Hence we can still use the metric regardless of the chaotic behavior. From the results in the paper, we see that the model is fit to real-world data in 3 different Indian states, highlighting that the model is configurable to a specific region and uses actual field data to analyze the underlying parameters, unlike the general set of models that Divino and colleagues describe in their letter [10]. Further, [6] states that deaths and hospitalization data are more reliable than projections made on the basis of reproductive rates. Hence we demonstrate the utility of a metric derived from the number of deaths in a population to predict a wave.
We have addressed these comments in the Discussion section.
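The piecewise-constant treatment of parameters mentioned in this answer can be represented with a small helper that maps a day to the value in force on that day. This is a sketch only: the helper name `piecewise_constant` and the idea of specifying breakpoints in days are assumptions, and no breakpoints or values from the paper are implied.

```python
import bisect

def piecewise_constant(breakpoints, values):
    """Build f(t) that is constant between consecutive breakpoints.

    values[0] applies before the first breakpoint, and values[k] applies
    from breakpoints[k-1] (inclusive) up to breakpoints[k] (exclusive).
    """
    breakpoints = list(breakpoints)
    values = list(values)
    if len(values) != len(breakpoints) + 1:
        raise ValueError("need exactly one more value than breakpoints")
    if breakpoints != sorted(breakpoints):
        raise ValueError("breakpoints must be in increasing order")

    def f(t):
        return values[bisect.bisect_right(breakpoints, t)]

    return f
```

For instance, `piecewise_constant([30, 90], [0.3, 0.15, 0.25])` would describe a hypothetical transmission rate that drops on day 30 and partially rebounds on day 90.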

Q8: Therefore, the authors must describe the scope and potential limitations of the proposed model from a clinical and epidemiological, not only mathematical, perspective, justifying how these exogenous variables can affect the results obtained.
1. Following the SEIRD framework, our model assumes that a person infected once will not be re-infected.
2. Localized changes in the dynamics are not captured by the model. For example, cases could spread densely in certain districts while remaining sparse in most others.
3. Changes in mortality rates due to vaccinations, new strains, etc. are not accounted for in the model. The effect of changing mortality is visible in the third wave, as the error between the predicted and observed number of cases when the model is fitted to real-world data. Despite this discrepancy, we observe that the D_ratio metric remains useful.
Addressing the comment, we have added these points in the Discussion section. We believe that this metric empowers the administration and hospitals to introduce timely interventions, plan and strategize better resource allocation.

Q10: Please justify the parameters shown in
We arrived at the seed parameters from the COVID-19 data up to 27th April 2020 to ensure that the chosen parameters are epidemiologically plausible. Earlier estimates put the mean infection time of COVID-19 at 9-10 days and the mean latency period at 1 week. A population size of 10 million was chosen to be realistic for an Indian city. The other parameters, such as beta, the influx of migrant population and its probability of spreading infection, are considered time varying over the course of the epidemic. Starting with these parameters as the seed, the time-varying parameters were modified to simulate and validate the purpose of the D_ratio metric in our study [Table 1]. We have added these details in the Methods section of the paper.

Q11: Please provide a sensitivity analysis considering changes in the parameters shown in
Thank you very much for your suggestion. We have addressed this comment in the subsection titled "Sensitivity of D_ratio to model parameters" in the Results section. This has helped us characterize and describe the model better. From Table 1 we can broadly classify the parameters as static parameters, which do not change during the epidemic simulation, and dynamic parameters, which are time varying and are adjusted throughout the simulation to fit the model to particular data. The sensitivity analysis was performed on the static parameters, namely the mean infection time (inf_t), the mean latency time (lat_t), the population, and the fraction of infections through direct contact (a). We observed that while the model was sensitive to the mean infection time and mean latency period, population and a did not affect the model with respect to the D_ratio metric. A summary of the sensitivity analysis is presented in the figure below [Fig 4]. From this summary it can be observed that D_ratio is not effective when the mean infection time is very high compared to the mean latency time; such a case has not been encountered so far.
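A sensitivity analysis of this kind can be organized as a one-at-a-time sweep, in which each static parameter is varied alone while the others stay at their baseline values. The sketch below is generic: the function name `one_at_a_time` and the convention that the model is a callable returning a single number are assumptions, and it is not the authors' analysis code.

```python
def one_at_a_time(model, baseline, sweeps):
    """Vary each parameter alone, holding the others at baseline.

    model:    callable taking keyword parameters and returning a number
              (for example, a summary of D_ratio from one simulation run).
    baseline: dict mapping parameter name to its baseline value.
    sweeps:   dict mapping parameter name to the list of values to try.
    Returns {name: [(value, output - baseline_output), ...]}.
    """
    base_out = model(**baseline)
    result = {}
    for name, values in sweeps.items():
        if name not in baseline:
            raise KeyError(f"{name!r} is not a baseline parameter")
        rows = []
        for value in values:
            params = dict(baseline)
            params[name] = value          # change only this parameter
            rows.append((value, model(**params) - base_out))
        result[name] = rows
    return result
```

Parameters whose rows stay near zero across the sweep have little effect on the output; one-at-a-time sweeps do not reveal interactions between parameters, which would need a joint sweep.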